Abstract:

The current literature in program testing is surveyed. A strategy
is proposed for eliminating categories of errors from programs.
Errors may be classified as functional (an incorrect input-output
pair) or structural (an incorrect statement). An error is eliminated
if a successful program execution for a given input implies the
program could not contain the error. A "creation condition" guarantees
that a structural error affects the program's computation. A
"propagation condition" guarantees that the effect produces a
functional error. An error is eliminated whenever a computation
satisfies both the creation and the propagation condition and produces
correct output.

Rarely has a developing field rapidly attained a unified
understanding of itself; program testing is no exception. Three
problems must be addressed if progress is to be made.

(1) How can the quality of test data be measured?

(2) How can the quality of a testing strategy be measured?

(3) What is an appropriate paradigm for program testing?

A positive answer to the first question would provide confidence
in the results of testing a single program. For now, the tester can
merely cite a few statistics (percentage of paths executed, percentage
of branches executed, etc.). But what is the value of executing 80% of
the paths? In what sense, if any, is it better to execute 1000 test
cases rather than 100? Without an underlying theory, statistical
claims are dangerous, because they can lull the tester into a false
sense of security.

Answering the second question does not automatically answer the
first; a good strategy may sometimes produce a bad test set. The
characteristics of a good strategy could guide researchers into more
profitable areas. It is entirely possible that strategies must be
specialized for different program classes. How then can the various
strategies be compared? Does it even make sense to compare strategies
that cannot be used for the same program? The ability to determine a
test's quality does not necessarily imply the ability to determine a
testing strategy's quality; this would require inferring the quality
of a testing strategy from its results on a finite number of
applications. This, in essence, is using testing to measure the
quality of testing, a rather dubious approach at best.

The third question suggests that testing is more than searching
for hidden program errors. Most strategies use what may be called an
error discovery paradigm; i.e., the ultimate goal of a testing
strategy is to generate inputs that show a program is incorrect. When
a program executes successfully, the fact is recorded and the search
continues for an input that will reveal an error. Thus, the error
discovery paradigm only allows the conclusion that a program is
correct on its tested domain. This paper suggests that an error
elimination paradigm is more appropriate. An error is eliminated if a
successful program execution for a given input implies the program
could not contain the error. Such an approach allows the conclusion
that specific errors are not contained in a program.

This paper discusses one way of eliminating errors from programs
through the use of creation and propagation conditions. A creation
condition guarantees that a potential error in the code affects the
program's computation. A propagation condition guarantees that the
effect produces an output error. If the output is correct, the
potential error did not occur, and thus can be eliminated from the
program. Section 2 of this paper surveys the best known functional and
structural testing methodologies. Section 3 discusses the results
currently known from testing theory. Section 4 presents reliability
theory and develops it in the context of error elimination as the goal
for testing. Section 5 introduces an error elimination strategy for
testing programs; it is based upon the concept of "error propagation."
The final section proposes areas in which further research seems
promising.

2. Survey of Dynamic Validation

2.1. Dynamic Analysis

Dynamic analysis [How78] involves the execution of a given program
with specific test data. The output is compared with the
specification to decide correctness. Test data selection may be based
upon the actual code or upon the specifications. The former case is
termed "structural testing" since the structure of the program is
considered in the test data selection. The latter case is termed
"functional testing" because only the input-output behavior is
considered in choosing test data.

2.1.1. Structural Testing

The goal in structural testing is program coverage. If the code
of a program can be sufficiently "exercised" (or covered) it seems
reasonable to conclude that any incorrect code will manifest itself,
thus revealing the presence of an error. Miller [Mil74] and Howden
[How78] suggest the following two structural coverage criteria:

(1) Statement coverage - Every statement should be executed. It is
unreasonable to expect that unexecuted code will perform
correctly when executed.

(2) Path coverage - Every path in the program should be executed.
"Path" is defined to be any possible flow of control through an
uninterpreted flowchart. Thus a path from a given flowchart may
not in fact be executable due to the particular conjunction of
conditions "guarding" the path. Howden calls such a path
infeasible [How76].

Clearly path coverage is impossible for any program containing a
loop with a run-time-determined exit condition, since each repetition
of the loop determines a new path. Various approximations to path
coverage are suggested to reduce the problem to manageable size.
Among these are branch testing and path equivalence classes.

(1) Branch testing. One approximation to path coverage is to ensure
that all potential branches are executed. "Potential branches"
has been alternately defined to mean "the potential outcomes of
a given conditional" or "the means by which those outcomes can be
obtained." The difference arises in compound conditionals such as
A v B, where the potential outcomes would merely require A v B to
evaluate in one case to T and in another case to F. The true
outcome may be obtained from several different cases such as A =
T, B = F and A = F, B = T as well as A = T, B = T. A simple
resolution of the difference is to require branch testing to be
performed on modified programs in which all compound conditionals
are expanded into simple conditionals.

(2) Path equivalence classes. For the infinite set of paths in a
given program, paths may be equated which share various
structural criteria. For example, level testing equates paths that
have the same depth of nesting within a program as determined
from the static code [Mil74]. This technique aims at testing
nested paths, thereby guaranteeing coverage of all decision-to-
decision paths in the program. A corresponding dynamic path
equivalence relation equates paths containing at most n
iterations of all loops.
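
The two readings of "potential branches" for the compound conditional
A v B can be illustrated with a small sketch. The Python below is only
an illustration; the function names and the sample test set are
assumptions introduced here, not part of the surveyed work.

```python
from itertools import product

def outcomes_covered(tests):
    """Outcomes (T or F) of the conditional A v B reached by the tests."""
    return {a or b for a, b in tests}

def true_cases_covered(tests):
    """Distinct (A, B) combinations through which the outcome T was reached."""
    return {(a, b) for a, b in tests if a or b}

# Hypothetical test set: enough for the "potential outcomes" reading,
# since A v B evaluates once to T and once to F.
outcome_tests = [(True, False), (False, False)]

# Every way of obtaining T, as the "means by which those outcomes can
# be obtained" reading would require.
all_true_cases = {
    (a, b) for a, b in product([True, False], repeat=2) if a or b
}
```

Under the first reading the two tests are adequate; under the second
they are not, because two of the three ways of making A v B true are
never exercised.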

The inadequacy of structural testing is shown by the following
incorrect solution for computing the maximum of a and b.

    if a > b
        then max := a
        else max := -a

The test {(a=1, b=-1), (a=-1, b=1)} satisfies all the structural
criteria given (all statements are executed, all branches are taken,
all paths are executed), yet the error is not evidenced for this
particular test.
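
The example can be checked directly. The following Python sketch
transcribes the incorrect program and compares it with Python's
built-in max; the helper name passes and the extra input (1, 2) are
assumptions added for illustration.

```python
def faulty_max(a, b):
    # Transcription of the incorrect program in the text: the else
    # branch returns -a where b was intended.
    if a > b:
        return a
    else:
        return -a

def passes(test_set):
    """True when faulty_max agrees with the built-in max on every test."""
    return all(faulty_max(a, b) == max(a, b) for a, b in test_set)

covering_test = [(1, -1), (-1, 1)]   # the test set from the text
revealing_test = [(1, 2)]            # hypothetical additional input
```

The covering test executes both branches and still passes, because for
a = -1, b = 1 the value -a happens to equal b. An input such as (1, 2)
takes the same else branch but exposes the error.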

As a result of pernicious examples like this, more refined
structural criteria have been proposed which require more detailed
differentiation by the test. These areas may be broadly defined as
mutation testing [DeM78] [Bud80], function testing [Fos78] [How80],
and domain testing [Zei80]. Each of these makes additional assumptions
about the nature of the program design, structure, or execution
behavior. With these assumptions a greater refinement of test cases
is possible, resulting in a greater "exercise" of the program.

Mutation testing [DeM78] assumes the "competent programmer
hypothesis," namely that a competent programmer under normal
conditions will produce code that is close to being the correct code.
Labeling the programmer's code as P and the correct code as P*, it is
reasonable to assume that relatively few syntactic changes in P will
result in P*. Alternately, P* has many "mutants" that are quite close
syntactically. A test set is considered reliable if it differentiates
P* from all of its mutants. A mutant is differentiated when it
executes incorrectly on a given test set, in which case the mutant is
said to be "killed." If all reasonable mutants are killed by a given
test set, correct operation on that test set implies the program
contains no "unreasonable" errors. If the competent programmer
hypothesis holds, the test set is reliable since competent programmers
produce only reasonable mutants.

To limit the number of mutants, it is necessary to restrict the
types and combinations of changes allowed. Common restrictions are to
allow replacing expressions with limited size expressions, to disallow
inserting of arbitrary statements, and to disallow making arbitrary
changes in the flowgraph. It is cogently argued [DeM78] that a test
set which kills single mutants will also kill double mutants.
Empirical studies [Bud80] involving mutation testing have shown it to
be quite effective as well as quite expensive, since a large number of
mutants must be generated and executed. Hamlet's system [Ham77]
reduces this time by executing compiled code up to the chosen point of
mutation, and then successively trying each mutant. Each system faces
two theoretical problems, namely, what happens when the mutant does
not halt within a specified period of time, and what happens when the
program does halt with correct output. In the first case, an
arbitrary time limit must be invoked, usually a function of the
running time of the original. In the second case, a human must
ultimately intervene. If the mutant is not the same as the original
(as in the case of an algebraic simplification), then the test set
must be augmented. The process begins again until all mutants are
killed (or shown equivalent to the correct program).
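
The kill-and-augment cycle can be sketched in a few lines of Python.
This is a minimal illustration under stated assumptions: the program
under test, the four hand-written mutants, and the helper surviving
are all hypothetical, and no real mutation system is being modeled.

```python
def original(a, b):
    """Program under test: the maximum of a and b."""
    return a if a > b else b

# Hypothetical single-change mutants of `original`, keyed by the change.
mutants = {
    "a >= b": lambda a, b: a if a >= b else b,   # equivalent to original
    "a < b": lambda a, b: a if a < b else b,
    "else a": lambda a, b: a if a > b else a,
    "else -a": lambda a, b: a if a > b else -a,
}

def surviving(test_set):
    """Names of mutants that agree with `original` on every test."""
    return sorted(
        name for name, mutant in mutants.items()
        if all(mutant(a, b) == original(a, b) for a, b in test_set)
    )

first_round = [(1, -1), (-1, 1)]
second_round = first_round + [(1, 2)]   # test set augmented by hand
```

After the first round two mutants survive. Augmenting the test set
kills one of them; the remaining survivor computes the same function
as the original, which is the situation in which a human must decide
that the mutant is equivalent rather than add further tests.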

Foster has proposed a method that may be called function testing
in which he gives criteria for testing specific program constructs for
typical errors [Fos78]. Howden has generalized this to make test
cases sensitive to potential errors in any of the primitive semantic
functions supported by a programming language. For instance, consider
a language in which each variable has two associated functions, STORE
(var, value) and RETRIEVE (var, value). The variable's RETRIEVE
function is invoked whenever the variable must be evaluated; the
variable's STORE function is invoked whenever the variable is assigned
a value. If a mutation occurs that substitutes one variable for
another variable in an expression, then the wrong RETRIEVE function
would be invoked when that expression is evaluated. If the test set
establishes at the mutation point different values for all variables,
then the wrong RETRIEVE function would introduce an incorrect value
into the evaluation of the expression. If the effect of this incorrect
value propagates to the output then the error will be manifested.
This test set is in some sense reliable for discovering errors
involving the use of a wrong variable in an expression. Howden extends
this to considerably more complex functions commonly occurring in a
programming language. The method can potentially eliminate an entire
category of mutants on a single execution. Its weakness lies in not
guaranteeing that potential errors manifest themselves.
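
A small sketch may make both the strength and the weakness concrete.
The Python below is illustrative only; the expression x + z, the
wrong-variable mutant, and the sign-testing wrapper are assumptions
chosen for the example.

```python
def intended(x, y, z):
    return x + z

def mutant(x, y, z):
    # Hypothetical wrong-variable error: y is retrieved where z was
    # intended.
    return x + y

def intended_sign(x, y, z):
    """A later computation that only reports whether the sum is positive."""
    return intended(x, y, z) > 0

def mutant_sign(x, y, z):
    return mutant(x, y, z) > 0
```

With x = 1, y = 2, z = 2 the two variables hold the same value and the
wrong retrieval introduces no incorrect value. With x = 1, y = 2, z = 3
all variables differ, so the expression value differs (3 instead of 4),
but the sign test maps both values to the same output and the error
does not manifest itself.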

Linear domain testing [Zei80] is an application of theoretical
ideas on path testing given by Howden [How76]. Each path can be
uniquely characterized by a subset of the input space called the path
domain. A program contains a domain error if an incorrect path is
followed for an input and produces incorrect output. An incorrect
computation along a path is called a computation error. Domain testing
therefore is a version of path testing. A linearly domained program P
satisfies the following:

(1) An input cannot follow an incorrect path and produce correct
output.

(2) No paths are missing from P.

(3) The input space for P is continuous.

(4) P contains no compound predicates.

(5) Adjacent domains compute different functions.

(6) Each predicate in the program is a linear transformation of the
program inputs.

Certain predicate errors may not be detectable by testing a
particular path. For example, it is impossible to determine if "3" is
the correct constant in the predicate "x + 3*y < 0" for a path in
which y has a constant value of 0. Such situations may arise when a
path assigns 0 to y ("assignment blindness") or selects only 0-valued
y's ("equality blindness"). Since assignment and equality blindness
are characteristics of the path up to a predicate, no amount of
testing of the path can eliminate the possibility of certain errors in
the predicate. Thus, every path containing a predicate implicitly
defines a set of errors that cannot be eliminated from the predicate
by testing that path. The set of errors in a given predicate that
cannot be eliminated by testing a collection of paths is the
intersection of all the non-detectable errors determined by each path
in the collection. Thus, testing an additional path is useful only if
the non-detectable errors for the new path do not contain the
intersection of all the non-detectable errors for the paths already
tested.

In the case of a linearly domained program, both paths and
predicates can be modeled as linear transformations. Let C be the
transformation for a path up to a predicate, let T be the
transformation for the predicate, and let T' = T + E be an erroneous
version of T. The transformation T' may not be detectably different
from T because TC = T'C, or equivalently, EC = Z (Z is the zero
vector). Solving EC = Z for values of E yields those predicate errors
which are not detectable due to assignment blindness. Predicate errors
may also remain undetected whenever EC ≠ Z but ECv = 0 for all v in
the path domain. Solving this equation for E yields those predicate
errors which are not detectable due to equality blindness.
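
Assignment blindness can be demonstrated on the predicate x + 3*y < 0
mentioned above. The Python sketch below is an illustration under
stated assumptions: the function names are hypothetical, and the
"blind" path is modeled simply by fixing y at 0 before the predicate is
evaluated.

```python
def predicate(c, x, y):
    """The predicate x + c*y < 0, with the constant c left as a parameter."""
    return x + c * y < 0

def blind_path(c, x):
    # Hypothetical path that assigns 0 to y before the predicate, so
    # the term c*y vanishes whatever the value of c.
    y = 0
    return predicate(c, x, y)

def distinguishable_on_blind_path(c1, c2, xs):
    """True if some x makes the two constants take different branches."""
    return any(blind_path(c1, x) != blind_path(c2, x) for x in xs)
```

Along the blind path the constants 3 and 5 select the same branch for
every x, so no test of that path can eliminate the erroneous constant.
A path on which y is free, for example x = -4 and y = 1, separates
them. In the notation of the text, the error changes only the
coefficient of y while the path transformation maps y to 0, so the
product of the error and the path transformation is the zero vector.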

2.1.2. Functional Testing

In functional testing test cases are selected to exercise the
specifications rather than the code itself. This is sometimes termed
a "black box" approach to testing since the code is ignored as a
source of information for selecting test data. The program's function
is the only concern; if the program satisfies the specification it is
correct and coverage criteria are unnecessary. Of primary concern are
the special values for each input variable given by the specification.
Test points are selected to ensure that values are input for both
extremal and non-extremal points as well as special values of every
variable. This quickly results in a combinatorial explosion,
controlled by partitioning and refining the overall requirements for
the code. Partitioning associates inputs that are closely related to
one another; refining associates particular functions and the code
that implements them. Howden [How80b] provides an excellent overview
of functional testing.

The most general specification available for functional testing
is the requirements document that specifies overall system operation.
Testing requirements involves selecting test points that aim at
determining overall satisfaction of the system goals. Details of how
the function is computed are ignored; an attempt is made to handle the
different combinations of possible input categories. Consider, for
example, a file system. There may be a requirement that a COPY does
not destroy the original file. Such a requirement may be tested
without regard for where or how files are stored. In designing the
system, decisions are made on how to represent a particular file type.
These decisions imply that certain functions may be necessary to
implement the COPY operation; these are termed "design functions."
Testing of these individual functions may be done in the same manner
as the testing of the COPY requirement, but on a smaller scale. Even
more detailed design functions may be specified at a lower level. In
this manner, the combinatorial problems are somewhat decreased. By
identifying various abstractions that are present in the input data it
is possible to further reduce the number of combinations.


2.2. Symbolic Execution

Before leaving this survey of testing methodologies, it is
appropriate to comment upon a hybrid between testing and formal
verification called symbolic execution [How77], [Cla77], and [Han76].
Formal verification requires a proof of various mathematical
properties to demonstrate correctness; symbolic execution aids in the
proofs of these properties by allowing execution of the program with
symbolic data. This is one step beyond data flow systems such as DAVE
[Ost76] in which the program is abstracted into a flowgraph with
movement of data along the paths. The entire semantics of the
programming language must be at hand to enable complete interpretation
of the program during symbolic execution. This enables the
construction of path conditions (the sequence of decisions made along
a path) which can be an aid in documenting the program and in
determining actual test data that satisfy the path condition. Formal
verification is aided in providing a description of the output in
terms of the input (and possible constants) which then needs to be
shown to satisfy the output condition. The input condition aids in
determining what branches may be chosen, either beforehand as in
DISSECT [How77] or interactively as in EFFIGY [Han76].

Symbolic execution systems which attempt to deduce the value of
conditionals are only as strong as the theorem provers upon which they
rely. The inability of a theorem prover to decide the value of a
conditional does not guarantee that the value cannot be decided. Thus,
human input may be required more frequently than necessary.
Similarly, having the path condition determined is of little value if
test cases cannot be automatically generated to satisfy the condition.
Since such generation is impossible (as discussed in the next section
of this paper), the system must again rely upon human input. The path
condition is frequently so complex that it is often easier for a
person to generate the test data from the code rather than the
condition.

3. Development of Theory

The development of testing theory has followed mostly two
directions, one of general unsolvability and one of solvability over
particular classes of programs. General unsolvability results [Hen77]
rely heavily on recursive function theory and deal with automatic
generation of test sets. With the general results rather dismal,
specific exceptions have been investigated. The search for classes of
programs in which testing is tantamount to formal verification is an
open area of research.

General unsolvability results in testing theory ultimately lie
close to the heart of recursive unsolvability, the halting problem.
Formal proofs of the results in this section may be found in many
excellent sources [Ham74] [Hen77]; the presentation here will be from
a testing viewpoint. First, we need some notation and a few simple
definitions, as taken from [Lin79].

Notation: If P is a program then [P] denotes the function that
P computes. The output of P on input x may be written as [P](x), if
[P] is defined for input x. The domain of [P] is denoted dom([P]).
"In" designates set membership and "∩" designates set intersection.

Definition: A specification S is a set of ordered pairs satisfying the
following:

a. S is recursive.

b. dom(S) is recursive.

Definition: A program P is said to be correct with respect to a
specification S iff

dom([P] ∩ S) = dom(S)

Definition: A test set is a subset of the domain of a specification.

Definition: A program P satisfies a specification S on a test set T iff
For All x In T, [P](x) is defined and (x, [P](x)) In S
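
These definitions can be exercised on a finite example. The Python
sketch below makes several simplifying assumptions that are not in the
text: the specification is a finite set of pairs, a raised exception
stands in for "[P](x) is not defined" (non-termination cannot be
observed this way), and all names are hypothetical.

```python
def satisfies_on(program, spec, test_set):
    """P satisfies S on T: for all x in T, [P](x) is defined and
    (x, [P](x)) is in S."""
    for x in test_set:
        try:
            y = program(x)
        except Exception:
            return False      # treated here as "[P](x) is not defined"
        if (x, y) not in spec:
            return False
    return True

def domain(spec):
    """dom(S): the first components of the pairs in S."""
    return {x for x, _ in spec}

def is_correct(program, spec):
    """P is correct with respect to S: P satisfies S on all of dom(S)."""
    return satisfies_on(program, spec, domain(spec))

# Hypothetical specification (squaring on -3..3) and an incorrect program.
square_spec = {(x, x * x) for x in range(-3, 4)}
wrong_square = lambda x: x * abs(x)
```

The incorrect program satisfies the specification on the test set
{0, 1, 2} but is not correct, since it fails for every negative input.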

The following classic theorems from recursive function theory are
included for completeness' sake. For proofs see [Hen77] or [Ham74].

Theorem (Halting Problem) Let P_1, P_2, ... be an effective
enumeration of all programs (say by their lexical order). There does
not exist a program P satisfying the following:

\[
[P](x) =
\begin{cases}
1 & \text{if } [P_x](x) \text{ is defined} \\
0 & \text{if } [P_x](x) \text{ is not defined}
\end{cases}
\]

Theorem (Program Equivalence Problem) There does not exist a
program P satisfying the following:

\[
[P](x, y) =
\begin{cases}
1 & \text{if } [P_x] = [P_y] \\
0 & \text{otherwise}
\end{cases}
\]


We obtain almost immediately from the above theorems the following
result:

Corollary

There does not exist a program that generates or recognizes a test
set T that satisfies any of the following properties (for all programs
P, specifications S, paths p, statements s, expressions e, and values
v):

a. P satisfies S on nonempty T

b. Path p of P is executed by T

c. Statement s of P is executed by T

d. Expression e in P evaluates to value v

e. P satisfies S on a nonempty subset of T

The results from above lead to a Murphy-like rule for the results
of testing theory, namely, if a desired result is powerful and
generally applicable then it cannot be obtained. Since weaker results
are not usually desired, to maintain strength it is necessary to
reduce applicability. Hence, whereas the corollary gives a gloomy
general forecast, for specific classes of programs and specifications
all the results are obtainable. Three prominent examples, [Bud80],
[Tsi70], and [How78b], completely characterize the program function by
a finite set of tests and a few restrictions about the program
structure.

Early work that has bearing upon testing programs from a
particular class comes from complexity theory based on the LOOP
hierarchy [Mey67] and further analyzed by Tsichritzis [Tsi70].
Briefly, a loop program consists of assignment statements

<assign> ::= <var> := <exp>

<exp> ::= <var> | <var> + 1 | 0

and loop statements

<loop> ::= LOOP <var> <assign> END

When control reaches a loop statement the <var> is evaluated to a
non-negative value and the list of assignment statements is then
executed that number of times. Arbitrary nesting of loop statements is
allowed. LOOP_0 is exactly that class of LOOP programs with only
assignment statements. LOOP_i programs (i>0) are LOOP programs in
which the maximum nesting level is i; thus, LOOP_{i+1} syntactically
contains all LOOP_i programs. Meyer and Ritchie [Mey67] have shown that
LOOP_{i+1} properly contains LOOP_i programs semantically as well. Thus
there are some LOOP_{i+1} program functions that are not computable by
LOOP_i programs. Furthermore, the infinite union of functions
computable by the LOOP_i programs is exactly the class of primitive
recursive functions and thus the hierarchy of functions computed by
LOOP programs forms a hierarchy of primitive recursive functions.
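
A minimal interpreter makes the LOOP language and its nesting levels
concrete. The Python sketch below is an illustration, not the
formulation of Meyer and Ritchie: the tuple encoding of statements,
the convention that unset variables read as 0, and the two sample
programs are all assumptions made here.

```python
def evaluate(exp, env):
    """Evaluate an expression: 0, a variable, or a variable plus 1."""
    kind = exp[0]
    if kind == "zero":
        return 0
    if kind == "var":
        return env.get(exp[1], 0)
    if kind == "succ":
        return env.get(exp[1], 0) + 1
    raise ValueError(f"unknown expression: {exp!r}")

def run(program, env):
    """Execute a list of statements, updating and returning env."""
    for stmt in program:
        if stmt[0] == "assign":
            env[stmt[1]] = evaluate(stmt[2], env)
        elif stmt[0] == "loop":
            # The loop variable is evaluated once, on entry; changes
            # made to it inside the body do not alter the count.
            for _ in range(env.get(stmt[1], 0)):
                run(stmt[2], env)
        else:
            raise ValueError(f"unknown statement: {stmt!r}")
    return env

def nesting_level(program):
    """Maximum loop nesting depth; 0 means assignments only."""
    return max(
        (1 + nesting_level(s[2]) for s in program if s[0] == "loop"),
        default=0,
    )

# z := x; LOOP y  z := z + 1  END           (computes x + y)
add = [
    ("assign", "z", ("var", "x")),
    ("loop", "y", [("assign", "z", ("succ", "z"))]),
]

# z := 0; LOOP x  LOOP y  z := z + 1  END  END   (computes x * y)
mul = [
    ("assign", "z", ("zero",)),
    ("loop", "x", [("loop", "y", [("assign", "z", ("succ", "z"))])]),
]
```

Here add has nesting level 1 and mul has nesting level 2, so under the
encoding assumed above they would be LOOP_1 and LOOP_2 programs
respectively.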

Tsichritzis [Tsi70] investigated the first two levels of the LOOP
hierarchy to show that LOOP_1 programs correspond to a subclass of
primitive recursive functions called simple functions. His result for
testing theory is that a finite set of input-output pairs uniquely
determines a simple function. He provides an upper bound on the size
of the test set which is computable from the simple function. Hence
the size can be functionally related to the structure of the LOOP_1
program since the determination of the simple function computed is
mechanical. The significance is that LOOP_1 forms a class of programs
which has an algorithm for generating a test set that proves the
program correct.

Two instances of program classes for which the above corollary
has a solvable counterpart are given in [How78b] and [Bud80]. The
first class is the set of programs characterized by the functions they
compute, multinomials. The second class is a subset of LISP programs
that satisfy a particular recursive schema. The techniques used to
generate data for these classes are not readily extendible to other
program classes because both rely upon the mathematical properties of
the functions being computed.

4. Reliability Theory

In attempting to provide a firmer foundation for testing (and to
allow it to be called a "theory") a reliability theory has been
developed for testing computer programs. This theory encompasses the
results of the previous section, and provides a framework for
evaluating the various ad hoc testing strategies mentioned earlier in
the paper by relating the notion of correctness to that of
thoroughness of a test set. Since test sets are essentially finite
(excluding symbolic evaluation), a reliable test set must somehow
capture the essence of the program on a finite domain. The results
from the previous section certainly imply that such sets cannot be
algorithmically constructed or recognized except for certain classes
of programs. Reliability theory has therefore concentrated on ways in
which test sets can be identified for particular classes of programs.

4.1. Early Attempts

Gerhart and Goodenough first attempted to provide a theoretical
basis for testing [Goo73]. A test selection criterion C is said to be
reliable if and only if all sets that satisfy the criterion either
prove the program incorrect (by failing to meet the specifications) or
satisfy the specification. A test selection criterion C is said to be
valid if and only if for every error point there is a test set that
satisfies the criterion C and proves the program incorrect. From
these two definitions, Gerhart and Goodenough prove their fundamental
theorem, namely, if a reliable test that satisfies the specifications
of a program is also valid, then the program is correct. Thus, the
job of the tester is to demonstrate that a given criterion is both
reliable and valid. Thereafter, one successful execution on a test
set satisfying the criterion proves the program. In some cases it is
trivial to prove either reliability or validity, but rarely is it
trivial to prove both. In fact, as shown by [Wey80], if a test
selection criterion is not valid it must be reliable and if it is not
reliable it must be valid. Indeed, if C is an invalid criterion, then
there is a point for which the program is wrong and no test set
discovers this. Hence, all the test sets imply the program might be
correct and therefore the criterion is reliable. If the criterion is
not reliable, then some of the test sets satisfying the criterion
disprove the program. Thus, for every point there is a test set that
proves the program incorrect and the test selection criterion C is
therefore valid.
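
The observation that every criterion is reliable or valid can be
checked by brute force on a small finite domain. The Python sketch
below rests on assumptions made only for illustration: the
specification is given as a function, a criterion is a predicate on
test sets, reliability is read as "every satisfying test set gives the
same verdict", and validity is read as "if the program has an error
point, some satisfying test set fails". All names and the sample
program are hypothetical.

```python
from itertools import combinations

def all_subsets(domain):
    """Every subset of a finite domain, as a list of sets."""
    items = sorted(domain)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def passes(program, spec, test_set):
    return all(program(x) == spec(x) for x in test_set)

def is_reliable(criterion, program, spec, domain):
    verdicts = {passes(program, spec, t)
                for t in all_subsets(domain) if criterion(t)}
    return len(verdicts) <= 1    # all satisfying sets agree

def is_valid(criterion, program, spec, domain):
    if passes(program, spec, domain):
        return True              # no error point to reveal
    return any(not passes(program, spec, t)
               for t in all_subsets(domain) if criterion(t))

DOMAIN = {0, 1, 2, 3}
spec = lambda x: x * x
program = lambda x: x + x        # agrees with spec only at 0 and 2

criteria = {
    "any single point": lambda t: len(t) == 1,
    "exactly {0, 2}": lambda t: t == {0, 2},
    "whole domain": lambda t: t == DOMAIN,
}
```

For this incorrect program, "any single point" is valid but not
reliable, "exactly {0, 2}" is reliable but not valid, and "whole
domain" is both; no criterion is neither, in line with the argument
above.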

In contrast to this thicket of intertwined definitions, Howden
[How76] and others have espoused the following definition of
reliability:

Definition: A test set T is reliable for a program P with respect to a
specification S iff

P satisfies S on T ==> P satisfies S on dom(S).

The distinct notions of reliable and valid are effectively combined
into one notion that still allows correctness to be concluded, but at
a rather steep price. The cost is found in having to verify that the
correctness of the program does follow from its correctness on a
finite domain. Howden analyzed path testing in the light of this
definition and showed that rather strong assertions must be proved
about the program if path testing is to be reliable. Categorizing
errors into computation errors (incorrect computation on a given
path), domain errors (incorrect path selection), and case errors
(missing paths), he was able to show sufficient conditions under which
path testing is reliable for two of these errors, assuming compound
errors do not occur. The results are as follows:

(1) Computation errors - All members of the path domain for a path
containing the error produce incorrect output.

(2) Domain errors - The path domains for the correct and incorrect
program share no points in common. Furthermore, the computed
function along each path is assumed to be different. This
prevents an input from following an incorrect path and still
producing the correct output.

(3) Case errors - Howden incorrectly identified these with domain
errors, resulting in path testing being reliable for case errors
iff useless code exists in the program text.

As can be seen from Howden's results, even assuming that all
paths can be tested, reliability is simply too strong a requirement to
determine by testing.

4.2. Error Reliability

Weakening the notion of reliability requires either narrowing the
class of programs that will be considered or reducing the requirements
of correctness. Linear domain testing [Zei80] is an example of the
former and mutation testing [DeM78] is an example of the latter.
Recently Howden [How80], Foster [Fos78], Ostrand [Wey80], and Weyuker
[Wey81] have proposed methods which can be labeled error-based testing
strategies. The goal is to demonstrate the absence of certain
predefined errors rather than (necessarily) the correctness of the
program. Test data is selected to enable errors, if present, to be
revealed [Wey80], provided the execution of the program does not
prevent an error from being manifested. Thus, error-based testing is
an example of reducing the requirement of correctness to weaken the
notion of reliability. The following is a definition of modified
reliability:

Definition A test set T is E-reliable (error reliable) for a program
P and specification S iff

    P satisfies S on T  ==>  P contains no errors of type E.

It should be noted that the concept of error type used in this
definition is as yet undefined. In the next section "error" is shown
to have two distinct usages, namely to reflect the incorrect operation
of the program (a functional error) or to pinpoint the location of the
error in the code (a structural error).

4.3. Errors

To gain a deeper understanding of various testing methodologies
it is necessary to understand each methodology's concept of error.
There are two general vantage points from which errors may be
approached, one structural and one functional. In the structural
approach an error is considered to be associated with the text of the
program, for example, an incorrect conditional that causes some inputs
to follow an undesired path. In the functional approach an error is a
program-computed input-output pair not satisfying the specifications.
In this approach no mention is made of how the output is computed.
The difference between the two is evidenced when an input follows an
undesired path but produces the correct output. In this case the
program has a structural error but a functional error has not been
manifested. (It must be the case, however, that a functional error
can be manifested on some other input, or the "error" is not one at
all.) Both approaches to error have their advantages and disadvantages
and a corresponding range of applicability.
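
The gap between the two views can be made concrete with a small
Python sketch. The two functions and the thresholds are hypothetical
and chosen only so that one input takes the wrong path yet still
yields the correct output.

```python
def correct(x):
    """Intended program: double the input above 5, otherwise add 5."""
    if x > 5:
        return 2 * x
    return x + 5

def faulty(x):
    """Same program with a structural error in the conditional."""
    if x > 3:              # structural error: the threshold should be 5
        return 2 * x
    return x + 5

def path(x, threshold):
    """Name of the branch an input follows for a given threshold."""
    return "double" if x > threshold else "add"

inputs = range(10)
# Inputs that follow an undesired path in the faulty program.
wrong_path = [x for x in inputs if path(x, 3) != path(x, 5)]
# Inputs on which a functional error is manifested.
functional_errors = [x for x in inputs if faulty(x) != correct(x)]
# Inputs on the wrong path whose output is nevertheless correct.
coincidentally_correct = [x for x in wrong_path if faulty(x) == correct(x)]

print(wrong_path, functional_errors, coincidentally_correct)
```

Here the structural error is exercised by two inputs but is visible
as a functional error on only one of them, so a test that happens to
use the other input cannot reveal it.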

Howden [How76] uses a structural concept of error, in that an
error is within a particular program and hence can be spoken of as
being a particular expression, within a particular statement, on a
given path, etc. Such a structural approach is intuitively satisfying
since it emphasizes that incorrect operation of a program ultimately
lies in some portion of the program text. Correcting the error
therefore naturally translates into transforming the program text.
Hence to identify that portion of the program as an error seems
natural. This approach has two deficiencies, especially when the
concept of errors is used to compare various testing methodologies.
First, a structural approach to error definition is more applicable to
procedural than to functional languages, since in the former the
location of a given error provides considerably more information than
in the latter (e.g. the type of the expression values, possible paths,
etc.). In self-modifying languages such as LISP and SNOBOL, statements
may be executed which do not even exist in the source code. Second, a
structural approach makes the correspondence between specification and
correctness difficult to state. It may be quite clear that a program
has failed to meet a specification by, say, terminating with incorrect
output for a valid input. Yet it is inappropriate to speak of "the"
program error, since such an error may actually involve the compound
result of several statements, none of which is wrong in and of itself,
but which are wrong taken as a whole.


A concept of error that avoids the above problems with the
structural approach lies in the operation rather than the structure of
the program. Such an approach may be termed "functional" because it
deals with the meaning of the program as expressed by its input-output
behavior. In a functional approach an error is associated with the
input-output behavior of the program as determined by the
specifications. An error occurs when a given input produces an
incorrect output; such an input is labeled as being in error. In
actuality, the error is the incorrect functioning of the program over
some subset of its input space. Two different programs in different
programming languages can in this sense contain the same error: they
produce the same incorrect output without regard to the syntactic
constructs that encode the error. Thus, a functional concept of error
allows error analysis across programming languages, something
difficult to achieve within a structural concept. Also, a functional
view allows the correspondence of errors and program correctness to be
clearly stated. To describe a program error in the functional sense
means to describe a set of inputs that produce wrong results. With
such a description it is possible to locate the structural construct
that encodes the error with a good possibility of seeing how to
correct it. The reverse is not true, however, since being told that a
given set of statements is wrong requires, in essence, the
reconstruction of the functional error category from the specification
to enable the error to be corrected.

To gain a better understanding of the various testing
methodologies, it is useful to see what kinds of functional and
structural errors each reveals. We have seen already that a useful
structural categorization of errors is that of computation, domain,
and case errors. Functional error categories have not been so clearly
delineated, but may be inferred from the types of tests done in
functional testing. If the competent programmer hypothesis applies to
the function implemented as well as the code produced, we may conclude
that functional errors occur as slight perturbations of the
specification.

Errors in a function may be categorized as follows:

(1) Boundary conditions - The function may be incorrect on boundary
points of the specified ranges of the input variables.

(2) Improper subfunction selection - The function may involve the
computation of several subfunctions, some of which may be invoked
at an improper time.

(3) Improper abstract relationships - The specification treats certain
input variables as an abstraction. The function may group the
wrong variables in attempting to implement the abstraction.

(4) Special values - Values like 0, 1, NULL often carry multiple
meanings across data types. The function may be incorrect at
these values.

It  is  important  to  note  that  these  are  errors  in  the  sense  of  the 
implemented  function  (the  input-output  pairs)  and  not  in  the  sense  of 
the  location  in  the  program. 

Clearly, no method is ideal for discovering all errors.
Furthermore, every method specializes in finding particular errors.
Thus, it
is  frequently  suggested  that  a  viable  testing  strategy  is  to  combine 
several  of  the  above  structural  and  functional  methods  to  achieve 
greater  coverage  of  error  categories.  With  more  categories  covered  it 
is  reasonable  to  assert  that  more  errors  will  be  discovered,  thus 
increasing  confidence  in  the  correctness  of  the  program. 


5. Error Propagation and Elimination

To maximize error coverage in testing, much current research has
focused upon how to combine validation techniques that cover
different error categories. One such combination is proposed here with
examples. It involves a hybrid of verification and testing in which
testing is used to establish the preconditions for a proof which
essentially states that, given an error, it will propagate to the
output of the program.

Proposals for combining testing and formal verification have
appeared several times [Goo75], [Ger76], and [Gel78]. Primarily the
focus has been on how to simplify formal verification. The finite
nature of testing suggests that testing could be used to prove the
basis step for some of the inductive proofs necessary in formal
verification [Goo75]. The difficult nature of theorem proving
suggests that only tested programs should be proved; the error-prone
nature of theorem proving suggests that all proved programs should be
tested [Ger76]. Geller [Gel78] has attempted to combine testing and
formal verification by using testing to simplify proofs. In this
case, testing is used to verify cumbersome predicates, e.g. those
involved in describing array initialization. The emphasis in all
these approaches is that formal verification demonstrates the
correctness of the program and testing supports this process.

The  proposal  of  this  paper  is  that  testing  and  verification  may 
be  combined  in  quite  another  way,  resulting  in  conclusions  about  the 
absence  of  certain  errors  in  the  program  rather  than  the  total 
correctness  of  the  program.  This  is  best  explained  by  the  following 
testing  strategy: 

(1) Identify the error categories of interest.

(2) Identify locations within the program where the errors could
occur.

(3) For each potential error location:

    a. Derive a condition under which an error will be
    created at the given location (the creation condition).

    b. Derive a condition under which an error will
    propagate to the end of the program (the propagation
    condition).

    c. Satisfy the conditions of (a) and (b) with appropriate
    test data points, then inspect the output. If the
    output is correct, the potential error does not exist at
    the given location.


To clarify the above strategy, a precise description must be given for
"error propagation."


5.1. Error Propagation


A comprehensive treatment of error propagation requires viewing
the semantics of a program from both a functional and a structural
perspective. In the functional approach [Lin79], a program is treated
as a mathematical function, i.e. a set of input-output pairs. In the
structural approach, a program is treated as a means of describing a
set of computations. Informally, a computation is a trace of a
program's execution. The set of computations of a program uniquely
determines the program function, but not vice versa. The ordered
input-output pairs of the program function bear no necessary
correlation to the program variables. To provide this relationship,
the concept of a "program state" is introduced.


Definition A state of a program P is a mapping

    s: var --> value

which associates a unique value with every variable of P.

An initial state of a program is a state which exists before any
statements of P have been executed. A final state is a state which
exists after the program has halted.

Some variables {x_i} of an initial state may be designated as
program input variables. Their corresponding values {u_i} are called
the program input. Some variables {y_i} of an initial state may be
designated as program output variables. Their corresponding values
{v_i} upon program termination are called the program output. If no
variables are explicitly designated as input or output, then all
variables are considered both input and output.




The program function that a program P computes, denoted by [P],
is therefore

    [P] = {(u, v) | v is the output of P on input u,
                    where u and v are ordered sets of values}

Any arbitrary program segment P implicitly defines a program function
with all variables designated input and output. For the purposes of
this paper, "program" and "program segment" are used interchangeably
unless otherwise stated.
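
As a minimal sketch of these definitions, a state can be modeled as a
Python dictionary from variable names to values, and the program
function as a set of (input, output) tuples. The two-statement
program segment and the variable names below are hypothetical.

```python
def run(state):
    """Hypothetical program segment: z := x + y; y := y + 1.

    Takes an initial state (variable -> value) and returns the final
    state, leaving the initial state untouched.
    """
    s = dict(state)
    s["z"] = s["x"] + s["y"]
    s["y"] = s["y"] + 1
    return s

def program_function(initial_states, inputs, outputs):
    """[P] restricted to the given initial states, as a set of (u, v) pairs.

    u holds the values of the designated input variables in the initial
    state; v holds the values of the designated output variables in the
    final state. Both are ordered by the variable lists supplied.
    """
    pairs = set()
    for s0 in initial_states:
        final = run(s0)
        u = tuple(s0[name] for name in inputs)
        v = tuple(final[name] for name in outputs)
        pairs.add((u, v))
    return pairs

states = [{"x": x, "y": y, "z": 0} for x in range(2) for y in range(2)]
P = program_function(states, inputs=("x", "y"), outputs=("z",))
print(sorted(P))
```

Designating every variable as both input and output, as the text
allows for program segments, amounts to passing ("x", "y", "z") for
both variable lists.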

Each approach to program semantics has certain advantages and
disadvantages for error analysis (see Section 4.3). For error
propagation, a functional semantics enables a clear definition of the
conditions under which an error in an initial state will propagate to
a final state.

Definition Let y be in the range of the function f. A level set of f
is

    D_y = {x | f(x) = y},

for some element y in the range of f.

Definition A propagation condition B of a function f is a predicate
defined on the domain of f satisfying the following:

    For all x <> y in dom(f),

    B(x) and B(y) implies f(x) <> f(y).

All domain elements of f which satisfy a propagation condition B
produce different members of the range of f, i.e., they fall into
different level sets of f. This is not to say that each pair of
domain elements from different level sets necessarily satisfies B. It
should also be noted that there may be more than one propagation
condition for a function.
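
On a finite domain both definitions can be checked by enumeration.
The Python sketch below is illustrative only; the squaring function
and the candidate predicates are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import combinations

def level_sets(f, domain):
    """Map each range value y to its level set D_y = {x | f(x) = y}."""
    sets = defaultdict(set)
    for x in domain:
        sets[f(x)].add(x)
    return dict(sets)

def is_propagation_condition(B, f, domain):
    """True if distinct elements satisfying B always have distinct images."""
    satisfying = [x for x in domain if B(x)]
    return all(f(x) != f(y) for x, y in combinations(satisfying, 2))

def square(x):
    return x * x

domain = range(-3, 4)
print(level_sets(square, domain))

# x >= 0 selects at most one member of every level set of square.
print(is_propagation_condition(lambda x: x >= 0, square, domain))
# A second, unrelated propagation condition for the same function.
print(is_propagation_condition(lambda x: x in (-3, 2), square, domain))
# The always-true predicate fails because square(-1) == square(1).
print(is_propagation_condition(lambda x: True, square, domain))
```

The second predicate shows that a function may have several
propagation conditions, and that a propagation condition need not
accept every pair of elements drawn from different level sets.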

The concept of a propagation condition explains step (3) of the
above strategy. Suppose a program P is divided into two parts, R and
G, with [P] = [G] o [R]. Suppose further that R and G are correct and
that R' is an erroneous mutant of R. To ascertain that R is indeed
the correct version and not R', execute P on an arbitrary input s. If
[R](s) can be shown to be different from [R'](s) and both [R](s) and
[R'](s) satisfy a propagation condition for [G], then the functional
error from R' will propagate through G. If the propagation condition
is satisfied, inspecting the output allows two conclusions. Not only
is it known that P is correct for s (any testing strategy would have
demonstrated this), but it is also known that P' (P with R' replacing
R) is not correct. If P' were the correct version, P would have been
incorrect on input s. Therefore, a potential error (that of
substituting P for P') has not occurred. If all potential errors were
eliminated in this manner, the program would be proven correct for all
input data.
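
The argument can be traced on a toy decomposition. In the Python
sketch below, R, its mutant, G, and the propagation condition B are
all hypothetical; B is a propagation condition for G because squaring
is one-to-one on the non-negative integers.

```python
def R(s):
    """First part of the program, presumed correct."""
    return s + 3

def R_mutant(s):
    """Hypothetical mutant R' of R: the sign of the constant is flipped."""
    return s - 3

def G(t):
    """Second part of the program."""
    return t * t

def B(t):
    """Propagation condition for G."""
    return t >= 0

def eliminates_mutant(s, expected_output):
    """True if a correct run on input s rules out the mutant R'."""
    created = R(s) != R_mutant(s)              # creation condition
    propagates = B(R(s)) and B(R_mutant(s))    # propagation condition
    output_correct = G(R(s)) == expected_output
    return created and propagates and output_correct

print(eliminates_mutant(4, 49))   # both intermediate values satisfy B
print(eliminates_mutant(0, 9))    # R'(0) = -3 violates B
print(G(R_mutant(0)) == G(R(0)))  # and here the error is in fact masked
```

With input 4 the correct output 49 shows that the mutant cannot be
the program under test. With input 0 the same check is inconclusive:
the intermediate values differ, but G maps both to 9, so correct
output says nothing about which version was executed.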

A computational semantics is appropriate for analyzing the state
transformations that occur as a program executes.

Definition A computation point for a program P is an ordered triplet,
c = (n, i, s), where

    n is a statement number of a statement in P,

    i is the iteration count, the number of times that
    statement n in P has been executed,

    s is a state of the program P.

This  definition  includes  the  iteration  count  to  allow  an  isolated  com¬ 
putation  point  to  be  identified  with  a  particular  statement  as  well  as 
with  a  particular  execution  of  that  statement. 

Definition A computation for a program P is a sequence of computation
points representing the execution of P along any feasible path of P.
A subcomputation for a program P is a subsequence of a computation of
P.


A few comments are in order concerning the preceding definitions.
First, the level of detail for computations could be increased.
Computation points could contain the entire history of the execution
of the program, including all the register loads, comparisons, etc.
Second, a computation of a segment P' of a program P is not
necessarily a subcomputation of P. [P'] may be defined for inputs that
may never occur as the result of earlier execution in P; therefore, P'
may have computations that do not occur as subcomputations of P.
Third, computations are defined only for feasible paths. Non-feasible
paths could never be executed and therefore have no corresponding
computations.

Computations and computation points facilitate the discussion of
error creation. Incorrect code must be executed for a functional
error to occur. Thus, the computation for a functional error contains
the information necessary to locate the error. An error may be
"created" in one of two ways. First, an incorrect statement may
produce an incorrect intermediate state. This state is incorrect in
that the correct statement would have produced a different state.
Second, the execution of an incorrect statement may lead to an
incorrect successor statement being executed.

Any distinguishing characteristic of a correct computation may be
used to decide if an arbitrary computation is incorrect. For
instance, suppose that a final state is only obtainable by a
computation of length greater than n. Any computation of length less
than n may then be rejected as incorrect. The process of rejecting a
computation may be viewed as applying a "characteristic function" to
the computation. This function selects a subset of the computation
points from the computation and then evaluates an expression on that
subset. The following defines the format for two classes of
characteristic functions.

Definition For any computation C = (c_1, ..., c_j) for a program P:

(a) exp@n on C denotes the value [exp](s),

    where s is the state of the last computation
    point for statement number n in C.

    If a computation point for statement number n
    does not exist in C,
    then exp@n is undefined.

(b) exp@hist(n) on C denotes the k-tuple

    ([exp](s_1), [exp](s_2), ..., [exp](s_k))

    where

    k is the number of occurrences in C of statement number n and

    s_1, ..., s_k are obtained from the k computation points
    in C for statement number n.

    If k = 0 then exp@hist(n) is undefined.

    If k = 1 then exp@hist(n) = exp@n.

In the expressions exp@n and exp@hist(n), n and hist(n) are called the
computation point specifiers, or more simply, the specifiers for the
expression exp.

The  specifiers  are  restricted  to  selecting  either  the  last  computation 
point  or  all  computation  points  for  a  particular  statement.  This  is 
because  the  primary  concern  is  the  effect  a  given  statement  has  on  a 
computation. 
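
The two specifiers can be sketched in Python over a recorded
computation. The four-statement program, the function names, and the
convention that each state is recorded just after its statement
executes are assumptions made for this illustration.

```python
def trace(y):
    """Computation of a small hypothetical program as (n, i, state) triples.

    Assumption: each state is recorded just after statement n executes.
    """
    points, counts = [], {}
    state = {"y": y, "z": None}

    def record(n):
        counts[n] = counts.get(n, 0) + 1
        points.append((n, counts[n], dict(state)))

    state["z"] = 0                            # 1: z := 0
    record(1)
    while True:
        record(2)                             # 2: while y < 3 do
        if not state["y"] < 3:
            break
        state["z"] = state["z"] + state["y"]  # 3: z := z + y
        record(3)
        state["y"] = state["y"] + 1           # 4: y := y + 1
        record(4)
    return points

def at(exp, n, computation):
    """exp@n: exp evaluated in the state of the last point for statement n."""
    states = [s for (m, _, s) in computation if m == n]
    if not states:
        raise ValueError(f"exp@{n} is undefined")
    return exp(states[-1])

def at_hist(exp, n, computation):
    """exp@hist(n): exp evaluated at every point for statement n, in order."""
    states = [s for (m, _, s) in computation if m == n]
    if not states:
        raise ValueError(f"exp@hist({n}) is undefined")
    return tuple(exp(s) for s in states)

C = trace(1)
z = lambda s: s["z"]
print(at(z, 3, C), at_hist(z, 3, C))
```

Unlike the definition, this sketch returns a one-element tuple from
at_hist when k = 1 rather than identifying it with the bare value.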

An error creation condition applied to a state of a computation
point tells whether the succeeding computation point is in error. A
creation condition is defined for a class of mutants of the correct
construct. All states s which satisfy the creation condition are
transcendental [Row81] for the class of mutants, i.e. when presented
with a transcendental state s, no two mutants from the class
transform s into the same state. For example, for the class of
polynomials P(M) with positive integral coefficients bounded above by
M, any input value greater than M + 1 is transcendental [Row81]. If,
therefore, an expected program error is the substitution of one member
of this class for another, an error will be created whenever the
polynomial is evaluated on a number exceeding M + 1. This error will
be reflected in the next computation point if the value is assigned to
a variable. If more computing is done first, as in the case of a
comparison, this error may be canceled and have no effect on the next
computation point. It is here that the detail of the computation
impacts the detection of structural errors. A more detailed
computation better distinguishes the instances of error creation and
error cancellation.
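
The polynomial example can be spot-checked by brute force. The Python
sketch below enumerates only polynomials up to a small degree, so it
illustrates the claim on a finite sample rather than proving it; the
function names and the bound on the degree are assumptions.

```python
from itertools import product

def polynomials(M, max_degree):
    """Coefficient tuples (a_0, ..., a_d), each a_k in 1..M, for d <= max_degree."""
    for d in range(max_degree + 1):
        yield from product(range(1, M + 1), repeat=d + 1)

def evaluate(coeffs, x):
    """Value of a_0 + a_1*x + ... + a_d*x**d."""
    return sum(a * x ** k for k, a in enumerate(coeffs))

def is_transcendental(x, M, max_degree):
    """True if no two polynomials in the enumerated class agree at x."""
    values = [evaluate(p, x) for p in polynomials(M, max_degree)]
    return len(values) == len(set(values))

M = 3
for x in (1, 2, M + 2, M + 3):
    print(x, is_transcendental(x, M, max_degree=2))
```

At x = 2 the polynomials 3 + x and 1 + 2x both evaluate to 5, so
substituting one for the other would create no error there, whereas
the sampled values above M + 1 separate every pair in the class.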

Once an error has been introduced into a computation, it is then
necessary to describe how the error will propagate to another part of
the computation. To do this, it is sometimes desirable to relate
functionally the values of expressions at two different points in the
computation. If the second expression evaluates to an incorrect value
whenever the first evaluates to an incorrect value, then any error
reflected by the first expression will propagate to the second
expression.

Definition Given a program P and class of computations S, for
specifiers x and y,

    exp1@x influences exp2@y on S

is used to mean the following:

    For all C and D in S for which exp1@x and exp2@y
    are both defined,

    if exp1@x on C <> exp1@x on D then

    exp2@y on C <> exp2@y on D.

If S is omitted, it is assumed to be the set of all computations of P.

A simple example will illustrate the idea of influence. Consider
the code:

    1  read (z);

    2  read (y);

    3  while y < 10 do

          begin

    4     z := z + y;

    5     y := y + 1

          end;

    6  write (y, z)

For the class of computations S in which y@2 = 0 for the above code:

(1) z@1 influences z@4 implies that the output of the loop will be
different for every input value of z.

(2) y@2 influences z@4 implies that the output of the loop will be
different for every input value of y. This is trivially true for
the class S, because the input y-value is constant for S.

(3) z@hist(3) influences z@4 implies that the output of the loop will
be different for every sequence of z values the loop can
compute.

(4) z@4 influences z@6 implies that an error in z upon loop exit will
propagate to the output statement.
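
These influence claims can be tested on sampled computations. The
Python sketch below transcribes the six-line example, reading
statement 1 as read(z), and records only the values of z needed for
the check; the helper names and the sampled inputs are assumptions.

```python
def run(z, y):
    """Execute the example and record z at statements 1, 4 (last), and 6."""
    obs = {"z@1": z}        # after statement 1: read(z)
    while y < 10:           # statement 3
        z = z + y           # statement 4
        obs["z@4"] = z      # overwritten so that the last value survives
        y = y + 1           # statement 5
    obs["z@6"] = z          # statement 6: write(y, z)
    return obs

def influences(a, b, computations):
    """a influences b: differing values of a imply differing values of b."""
    defined = [c for c in computations if a in c and b in c]
    return all(c[b] != d[b]
               for c in defined for d in defined if c[a] != d[a])

# A finite sample of the class S, in which y@2 = 0.
S = [run(z, 0) for z in range(-5, 6)]
print(influences("z@1", "z@4", S))
print(influences("z@4", "z@6", S))

# Outside S the first claim can fail: both runs leave z = 9 at statement 4.
mixed = [run(0, 9), run(-8, 8)]
print(influences("z@1", "z@4", mixed))
```

The last check shows why the class of computations matters: once the
input y is allowed to vary, two different input values of z can be
driven to the same value by the loop, so an error in z at statement 1
need not propagate.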


The influence dependences for a program cannot be determined
algorithmically because this could require deciding if arbitrary loops
halt, which is impossible (see Section 3). When such dependences can
be shown, however, propagation conditions may be easier to prove. For
example, suppose x@n1 influences y@n2 and y@n2 influences z@n3 for a
computation C. If it is known that x contains an incorrect value at
line n1, then z contains an incorrect value at line n3, and the
error has propagated.

In Section 4.1 a computation error was defined to be an incorrect
computation along a particular path. To understand the propagation of
a computation error, the following definitions are given.

Definition Let P be a program with computation C in which computation
point c1 = (n1, i1, s1) precedes c2 = (n2, i2, s2). The intermediate
code determined by c1 and c2 is the set of statements of P executed
between n1 and n2 in the computation C. The intermediate function
determined by c1 and c2 is the program function of the intermediate
code.

Definition Let C = (c0, ..., cm, ..., cn) be a computation for a
program P.

Let R be the intermediate code determined by c0 and cm.

Let G be the intermediate code determined by cm and cn.

Suppose R is incorrect and that R* is a correct mutant
of R.

Let s0 be a valid input for P for which [R] is defined.

Let

    [R](s0) = s1        [R*](s0) = s1*

    [G](s1) = s2        [G](s1*) = s2*

Let exp be any expression over the program variables.

(a) If s1 <> s1* then R has created a state error s1 for s1* on input
s0.

(b) If R has created a state error and s2 <> s2*, the state error s1
for s1* propagates through G.

(c) If [exp](s1) <> [exp](s1*) then R has created an expression
error for exp.

The  following  theorems  are  trivially  true  by  substitution  of  the 
definitions  given  earlier  in  this  section.  They  are  given  here  to 
illustrate  how  the  concepts  are  related. 




Theorem A state error s1 for s2 propagates through the intermediate
code, P, iff s1 and s2 are in different level sets for [P].

Proof Consider state error s1 for s2. If s1 and s2 are in different
level sets of [P], then [P](s1) <> [P](s2), and the state error
propagates through P. If the state error propagates through P, then
[P](s1) <> [P](s2). Thus, s1 and s2 are in different level sets of
[P].


Theorem Let P be the intermediate code delimited by statements n1 and
n2. Let C and D be two computations of P with different initial
states s_C and s_D, respectively.

(1) exp1@n1 influences exp2@n2 for all computations executing P,

iff

(2) If s_C and s_D are in different level sets of [exp1],
    then

    [exp2][P](s_C) <> [exp2][P](s_D).


Proof Assume (1). If s_C and s_D are in different level sets of
[exp1], then [exp1](s_C) <> [exp1](s_D). By definition of influence,
[exp2][P](s_C) <> [exp2][P](s_D), and (2) follows immediately.

Assume (2). If [exp1](s_C) <> [exp1](s_D) then s_C and s_D are in
different level sets of [exp1]. Combining this with (2) yields
[exp2][P](s_C) <> [exp2][P](s_D), and (1) follows immediately by the
definition of influence.


It has been noted already that there can be more than one
propagation condition for a function f. This is clear because every
propagation condition is implicitly defined by its propagation set,
the set of all values that satisfy the propagation condition. Every
subset of a propagation set is a propagation set, so there are many
propagation conditions.

Theorem  Let  B  be  a  propagation  set  for  a  function  f.  No  two  members 
of  B  are  in  the  same  level  set  of  f. 

Proof  Let  x  and  y  be  different  elements  of  3  By  definition  of  a  pro¬ 
pagation  set!  f ( x )  O  f<y>.  Thusi  x  and  y  can  not  be  in  the  same 
level  set  of  f. 
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What  character i sties,  if  any.  qualify  on#  propagation  condition 
to  to#  declared  "better"  than  another?  On#  characteristic  is  the  "gen¬ 
erality"  of  the  propagation  condition  or  set.  Recall  that  a  potential 
error  can  be  elimin»ted  only  if  a  state  error  can  be  detected  when  it 
occurs.  A  state  error  si  for  s2  can  be  detected  only  if  both  si  and  s2 
satisfy  the  propagation  condition/  i.  e.  both  si  and  s2  are  in  the 
propagation  set.  Therefore/  increasing  the  size  of  the  propagation 
set  increases  the  number  of  potential  errors  that  can  be  eliminated. 
If  B  is  any  propagation  set  for  the  function  f.  it  satisfies  the  fol¬ 
lowing  set  equation: 

B  -  <x  I  For  All  y  O  i  in  B,  f(x>  O  f(y)>. 

Clearly, the smallest propagation set is the null set. Also, by the
last theorem above, all members of B are in different level sets of
f. Since two propagation sets may both be infinite, describing one as
larger than another is inappropriate; the term "most general" may be
used instead. Thus, a most general propagation set of a function f is
a propagation set that contains one member from each level set of f.
Associated with this set is a most general propagation condition.
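
The definitions above can be made concrete on a finite domain. The
following Python sketch is an illustration only: the names level_sets,
is_propagation_set, and most_general_propagation_set are hypothetical,
and the finite domain is an assumption made so that the sets can be
enumerated.

```python
from collections import defaultdict

def level_sets(f, domain):
    """Group the domain by the value of f; each group is one level set."""
    groups = defaultdict(list)
    for x in domain:
        groups[f(x)].append(x)
    return list(groups.values())

def is_propagation_set(f, candidate):
    """True if f distinguishes every pair of distinct members."""
    values = [f(x) for x in candidate]
    return len(values) == len(set(values))

def most_general_propagation_set(f, domain):
    """Pick one representative from each level set of f."""
    return {members[0] for members in level_sets(f, domain)}

# Illustrative function: squaring merges x and -x into one level set.
f = lambda x: x * x
domain = range(-3, 4)
B = most_general_propagation_set(f, domain)
```

Here B holds one value per level set of f, and any subset of B is again
a propagation set, which matches the subset remark above.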

A second characteristic of a propagation condition is that of
applicability. Until now it has been implicitly assumed that the
function will be applied to all domain elements. Since some values
may not be feasible as input to an intermediate program function, the
propagation set should contain as many feasible values as possible.
Propagation conditions may therefore be compared on the basis of how
many feasible values they contain.

A final characteristic of a propagation condition is that of
efficiency. If two propagation conditions are equivalent in terms of
applicability and generality, they may be differentiated on the basis
of their ease of evaluation. This efficiency characteristic cannot
increase the number of errors that can be theoretically eliminated,
but it may increase the number of errors that can be practically
eliminated in an implementation.

1  2-  Testing  Accumulation  Programs 

Accumulation programs [Bas80] are the class of programs satisfying
the following schema and restrictions:

Q:  1  z := z0;
    2  while Not (y In Null(Y)) do
       begin
    3     z := acc(z, k(y));
    4     y := h(y)
       end

y denotes all program variables requiring definition for the loop
body to be defined on any iteration. z does not enter into the
computation for h(y) in line 4. Thus, z does not influence the
flow of control of the loop.

The functionalities of h, k, acc and Q are:

[h]   : Y --> Y
[k]   : Y --> Dbase
[acc] : Z x Dbase --> Z
[Q]   : Z x Y --> Z x Y

where Y denotes the values the variable y may assume, Z denotes
the values the variable z may assume, and Dbase denotes the range
of [k]. Null(Y) denotes all values of y which terminate the loop.

The following is an example of an accumulation loop:

z := 0; y := 0;
while y < 10 do
begin
   z := z + y;
   y := y + 1
end

The variable z accumulates information as the loop iterates through
different values of y. In this accumulation program we have:

Y, Z, Dbase = Natural Numbers
Null(Y) = {y : y >= 10}
[h]   = successor function
[k]   = identity function
[acc] = addition function

Clearly, the restrictions are satisfied.
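
As a hedged illustration, the schema can be written as a higher-order
function and instantiated with the example loop. The function name
run_accumulation and its parameter names are assumptions introduced
here; only h, k, acc, z0 and Null(Y) come from the schema.

```python
def run_accumulation(z0, y0, in_null, h, k, acc):
    """Execute schema Q and return the final (z, y)."""
    z, y = z0, y0
    while not in_null(y):
        z = acc(z, k(y))   # z accumulates information extracted from y
        y = h(y)           # z never enters the computation of the next y
    return z, y

# Instance from the text: accumulate the natural numbers 0..9.
result = run_accumulation(
    z0=0, y0=0,
    in_null=lambda y: y >= 10,   # Null(Y)
    h=lambda y: y + 1,           # successor function
    k=lambda y: y,               # identity function
    acc=lambda z, d: z + d,      # addition function
)
```

Because z is never passed to h or in_null, the restriction that z does
not influence the flow of control holds by construction.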

In an accumulation loop, the restriction on z allows the computation
for y to be separated from the computation for z. The above
accumulation loop schema Q is therefore functionally equivalent to the
following transformed schema QQ:

QQ: n := 0; z := z0;
    while Not (y In Null(Y)) do
    begin
       Rh[n] := y;
       n := n + 1;
       y := h(y)
    end;
    j := 0;
    while j < n do
    begin
       z := acc(z, k(Rh[j]));
       j := j + 1
    end

The first loop in QQ computes the same intermediate values of y that
Q computes, but stores them in an array Rh (results of h). The second
loop in QQ processes the array element-by-element, extracting the
desired information (via k) and accumulating it into z (via acc).
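
The equivalence of Q and QQ can be checked on a sample instance. The
sketch below is not a proof; run_q, run_qq, and the particular h, k and
acc chosen for args are hypothetical and serve only to exercise both
schemas on the same input.

```python
def run_q(z0, y0, in_null, h, k, acc):
    """Schema Q: control flow and accumulation interleaved."""
    z, y = z0, y0
    while not in_null(y):
        z = acc(z, k(y))
        y = h(y)
    return z, y

def run_qq(z0, y0, in_null, h, k, acc):
    """Schema QQ: record the y-values first, accumulate afterwards."""
    rh = []                       # Rh: the initial and intermediate y-values
    y = y0
    while not in_null(y):         # first loop: control flow only
        rh.append(y)
        y = h(y)
    z = z0
    for stored in rh:             # second loop: accumulation only
        z = acc(z, k(stored))
    return z, y

# An arbitrary, non-commutative accumulation to make the check meaningful.
args = dict(z0=1, y0=3, in_null=lambda y: y > 20,
            h=lambda y: 2 * y + 1, k=lambda y: y % 7,
            acc=lambda z, d: 3 * z + d)
```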


Lemma  The accumulation loop Q computes the function:

[Q](z0, y0) = (z, y) such that

z = [acc](...[acc]([acc](z0, [k](y0)), [k]([h](y0))) ..., [k]([h]^(n-1)(y0)))

y = [h]^n(y0)

where n is the smallest value such that [h]^n(y0) is in Null(Y).


To be able to test accumulation loops for possible errors, it is
necessary to understand how errors propagate through an accumulation
loop. Suppose H is a class of mutants of h; we say that H is an error
category for h. Clearly, a substitution of h' in H for h may affect
the computation of the loop. If it can be shown that the substitution
results in a "positive error" ([h'](y) > [h](y)) during every
iteration of the loop, and that this positive error accumulates into the
variable z, then z will necessarily be incorrect on loop termination.
The following theorem states sufficient conditions to guarantee
that such accumulation does occur.


Definition  For two k-tuples,

A = (a_1, ..., a_k) and B = (b_1, ..., b_k),

A <= B iff for all i, a_i <= b_i.

This definition is applied recursively if a_i and b_i are sets of the
same cardinality. If A <= B then B maximizes A.

Theorem  For the accumulation program schema Q above, let

H denote the error category of h,

Y denote the set of y-values for all possible iterations of the loop,

Z denote the set of z-values for all possible iterations of the loop,

D_k denote a subset of the domain of [k].

Assume Y, Z, and D_k each have a partial ordering operator, <=.

Let Q' result from substituting h' in H for h in Q. Let C be the
computation of Q on input y0 and C' be the computation of Q' on input y0.
The following conditions are sufficient to guarantee that an expression
error for z occurs after executing Q and Q' on input y0.

(1) Q and Q' iterate the same number of times (> 1).

(2) For each iteration, [h](y) <= [h'](y),
    or
    for each iteration, [h](y) >= [h'](y).

(3) For at least one iteration, [h](y) <> [h'](y).

(4) [k] is strictly monotonic on D_k.

(5) For each iteration both [h](y) and [h'](y) are members of D_k.

(6) [acc] is strictly monotonic in both variables.

Proof

Let Rh denote the tuple containing the initial
and intermediate y-values computed by Q.

Let Rh' denote the tuple containing the initial
and intermediate y-values computed by Q'.

By conditions (1) and (2) of the theorem, assume without
loss of generality that Rh' maximizes Rh.

By conditions (3), (4) and (5), [k](Rh') maximizes [k](Rh).

Let

[k](Rh) = (z_1, ..., z_n) and [k](Rh') = (z_1', ..., z_n').

For schema Q (by the preceding lemma),

z = [acc]([acc](...([acc](z_0, z_1), z_2), ...), z_n)

and for schema Q',

z' = [acc]([acc](...([acc](z_0, z_1'), z_2'), ...), z_n').

By repeated application of condition (6) of the theorem,
we may conclude that an expression error occurs for z.
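
The argument of the proof can be traced on a small instance. The sketch
below is only an illustration under stated assumptions: y is modeled as
a pair (counter, value), the functions h, h_bad, k and acc are
hypothetical, and leq implements the componentwise ordering of the
definition above.

```python
def leq(a, b):
    """Componentwise order, applied recursively to nested tuples."""
    if isinstance(a, tuple):
        return len(a) == len(b) and all(leq(x, y) for x, y in zip(a, b))
    return a <= b

def trace(y0, in_null, h):
    """Rh: the initial and intermediate y-values visited by the loop."""
    ys, y = [], y0
    while not in_null(y):
        ys.append(y)
        y = h(y)
    return tuple(ys)

def accumulate(z0, ys, k, acc):
    """Fold acc over k(y) for each recorded y, as in the lemma."""
    z = z0
    for y in ys:
        z = acc(z, k(y))
    return z

in_null = lambda y: y[0] >= 4                   # counter controls termination
h       = lambda y: (y[0] + 1, y[0] ** 2)       # intended function
h_bad   = lambda y: (y[0] + 1, y[0] ** 2 + 1)   # mutant: never below h
k       = lambda y: 2 * y[1]                    # strictly increasing in the value
acc     = lambda z, d: z + d                    # strictly increasing in both arguments

rh, rh_bad = trace((0, 0), in_null, h), trace((0, 0), in_null, h_bad)
z_good = accumulate(0, rh, k, acc)
z_bad = accumulate(0, rh_bad, k, acc)
```

In this instance Rh' maximizes Rh, both loops iterate four times, and
the final values of z differ, as the theorem predicts.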

In the following example, the above theorem will be used to aid
in testing a program containing an accumulation loop.

S. 2- Example

The strategy and the theory developed are now applied to a program
which computes the area under a curve by rectangular approximation.
Test data is not included due to the generality of the polynomials
in the program.

      program calcarea (input, output);
      var a, b, incr, area, value : real;
      begin
 1      read (a, b, incr);   {incr > 0}
 2      value := p1(a);
 3      area := 0;
 4      while a + incr <= b do
        begin
 5        area := area + value * incr;
 6        a := a + incr;
 7        value := p2(a)
        end;
 8      incr := b - a;
 9      if incr >= 0 then begin
10        area := area + value * incr;
11        writeln ('area by rectangular method:', area)
        end else
12        writeln ('illegal values for a=', a, ' and b=', b)
      end.
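
For readers who wish to experiment, a direct Python port of calcarea
might look as follows. It is a sketch, not part of the original
program: p1 and p2 are passed in as parameters, the input and output
statements are replaced by arguments and a return value, and None
stands in for the "illegal values" message.

```python
def calcarea(a, b, incr, p1, p2):
    """Rectangular approximation of the area under a curve from a to b.

    Assumes incr > 0, as the original does. Trailing comments give the
    line numbers of the Pascal listing.
    """
    value = p1(a)                      # 2
    area = 0.0                         # 3
    while a + incr <= b:               # 4
        area = area + value * incr     # 5
        a = a + incr                   # 6
        value = p2(a)                  # 7
    incr = b - a                       # 8
    if incr >= 0:                      # 9
        area = area + value * incr     # 10
        return area                    # 11
    return None                        # 12

identity = lambda x: x
left_sum = calcarea(0.0, 1.0, 0.25, identity, identity)
illegal = calcarea(2.0, 1.0, 0.5, identity, identity)
```

With the identity polynomial on [0, 1] and incr = 0.25, the result is
the left-endpoint sum 0.25 * (0 + 0.25 + 0.5 + 0.75).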


Using the strategy given above, we have the following.

Step 1  Identify the error categories.

(1) Incorrect polynomials p1 and p2, both of which are members of
P+(M), the set of polynomials of the form

    a_0 + a_1 x + ... + a_n x^n

where 0 <= a_i <= M and a_i is an integer.
P+(M) is an important category of polynomials for which
transcendental testing is appropriate [Row81]. (See Section 5.1 of
this paper.)

(2) Incorrect comparisons. A wrong comparison operator is used.

(3) Incorrect accumulation. Wrong variables or wrong operators are
used.

(4) Incorrect initialization.

(5) Incorrect output. An incorrect variable is used in an output
statement.

Step 2  Identify the locations where errors could occur.

    Error category                      Lines
    1. Incorrect Polynomials (IP)       2, 7
    2. Incorrect Comparisons (IC)       4, 9
    3. Incorrect Accumulation (IA)      5, 6, 10
    4. Incorrect Initialization (II)    2, 3, 8
    5. Incorrect Output (IO)            11, 12

Step 3  Derive Creation and Propagation Conditions

A creation condition and propagation condition are now provided
for each of the error locations given in step 2. A creation condition
guarantees that the error, if present, produces a state error. It
must be satisfied just before the potentially erroneous statement is
executed. A propagation condition guarantees that this state error
propagates to an output statement. It is evaluated immediately after
the potentially erroneous statement is executed.

Each error is designated by an error category abbreviation, followed
by a line number, e.g., IP 7 designates "Incorrect Polynomial at
line 7."

Errors IO 11, 12

Creation Condition -- all variables have different values.

Propagation Condition -- true

Error IA 10

Two cases are considered.

(1) An incorrect accumulation operator has been used; perhaps +
should have been -, *, or /.

    Error    Creation Condition
    -        value * incr <> 0
    *        area > 2 and value * incr > 2
    /        value * incr > 1 and area > 1

Propagation Condition -- true.

(2) An incorrect accumulation base element (value * incr) has been
used.

    Error                Creation Condition
    Off by a constant    true
    Off by a factor      value * incr <> 0

Propagation Condition -- true.
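
The operator table in case (1) can be spot-checked mechanically. The
sketch below enumerates a small grid of exact rational values and
confirms that, wherever a creation condition holds, the mutant operator
yields a different area than +. The names CREATION and
creates_state_error, and the choice of grid, are assumptions made for
this illustration; a finite grid is evidence, not a proof.

```python
import operator
from fractions import Fraction
from itertools import product

# Mutant operator -> creation condition on (area, x), where x = value * incr.
CREATION = {
    operator.sub:     lambda area, x: x != 0,
    operator.mul:     lambda area, x: area > 2 and x > 2,
    operator.truediv: lambda area, x: x > 1 and area > 1,
}

def creates_state_error(mutant, area, x):
    """True if the mutant operator yields a different area than '+'."""
    return mutant(area, x) != area + x

grid = [Fraction(n, 2) for n in range(-8, 9)]   # -4, -3.5, ..., 4

# Points where a creation condition holds but no state error appears.
violations = [
    (mutant.__name__, area, x)
    for mutant, condition in CREATION.items()
    for area, x in product(grid, grid)
    if condition(area, x) and not creates_state_error(mutant, area, x)
]
```

The bound of 2 in the condition for * is needed because 2 + 2 = 2 * 2,
so area = 2 and value * incr = 2 would create no state error.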

Error IC 9

Perhaps the >= should have been another comparison operator.

    Error    Creation Condition
    >        incr = 0
    <        true
    <=       incr <> 0
    =        incr > 0
    <>       incr <= 0

Propagation Condition -- true.

Error II 8

Three cases are considered.

(1) The wrong variable may occur on the left hand side of the
assignment. This error is particularly nasty because any resulting
functional errors depend upon whether the incorrect variable is
used ("live") or not used ("dead") in the remaining computation.
If the incorrect variable is dead, then a functional error can
occur only when the correct variable influences the output of the
program. Data flow analysis will detect this type of error.
Additionally, if the incorrect variable is live, then a functional
error will also occur whenever the incorrect variable
influences the output of the program. Since data flow analysis
can isolate the first error, the second error is considered here.

    Error                               Creation Condition
    Substitution of a live              [incr <> b-a]@4
    variable for incr on the
    left side of the
    assignment statement

Two propagation conditions are given, the first representing a
domain error and the second representing a computation error.

    [incr >= 0 and (b-a) < 0]@4
    [incr >= 0 and value <> 0]@4

Recall that the specifier @4 implies evaluation of these expressions
at loop termination. These two conditions may be combined,
yielding:

    [incr >= 0 and ((b < a) or value <> 0)]@4

(2) A wrong variable may have been substituted on the right hand side
of the assignment.

    Error                       Creation Condition
    variable 'a' is an          All variables have different
    incorrect variable          values from 'a'.

    variable 'b' is an          All variables have different
    incorrect variable          values from 'b'.

Propagation Condition -- Same as in case (1).

(3) An incorrect constant expression may have been used on the right
hand side of the assignment.

    Error                Creation Condition
    Off by a constant    true
    Off by a factor      [b - a <> 0]@4

Propagation Condition -- Same as in case (1).


Error IP 7

Creation Condition -- a@6 > M + 1 for each iteration of the loop and
the loop executes more than once.

Recall that all values greater than M + 1 are transcendental for all
polynomials in P+(M). See Section 5.1 of this paper.

Propagation Condition -- true.

The simplicity of this propagation condition is guaranteed by the
accumulation loop theorem from the previous section. In order to show
this, it must be shown that the loop is an accumulation loop and that
it satisfies the conditions of the theorem. If this is the case, then
the theorem states that there will be an expression error for area on
loop exit. Furthermore, on the last iteration of the loop, a + incr
<= b on loop entry, so b - a >= 0 on loop exit. But b - a >= 0 is
the propagation condition which ensures that area@8 influences
area@10; any error in value@10 merely increases the magnitude of the
error in area@10. Thus, if the loop satisfies the conditions of the
theorem, the given error will propagate to the output statement in
line 11.

To show that the loop is an accumulation loop, we note the following
correspondences to the schema Q:

    y corresponds to (a, b, incr, value)

    Null(Y) = {(a, b, incr, value) | a + incr > b}

    z corresponds to area

    [h](a, b, incr, value) = (a+incr, b, incr, p2(a+incr))

    [k](a, b, incr, value) = value * incr

    [acc](area, x) = area + x

Clearly, area does not enter into the computation of h. Also,

    Y = Real x Real x Real x Real
    Z = Real
    Dbase = Real
    D_k = Real x Real x {incr} x Real, i.e. D_k has a fixed value
    of incr.

To show that the loop satisfies the conditions of the theorem, we
first note that the error category is

    H = {h' | [h'](a, b, incr, value) = (a+incr, b, incr, [P'](a+incr))},

where P' is a member of P+(M).

The theorem conditions are therefore satisfied.

(1) Substitution of h' (from error category H) for h does not change
the number of times the loop executes, since the computation for
a, incr, and b remain unaffected.

(2) Provided the creation condition is satisfied for all iterations
of the loop,

    [h'](y) >= [h](y) for each iteration
    or
    [h](y) >= [h'](y) for each iteration.

If this were not the case, then for some t1 and t2, transcendentals
for h and h',

    [h](t1) > [h'](t1) and [h'](t2) > [h](t2).

Since the functions [h] and [h'] are continuous, there must be a
point t between t1 and t2 such that [h](t) = [h'](t). Thus, t
is not a transcendental. But this is a contradiction since all
points greater than M + 1 are transcendental and t > t1 > M + 1.
Thus the functions [h] and [h'] do not cross on any point beyond
M + 1 and one always maximizes the other on this interval.

(3) If the creation condition is true, [h'](y) <> [h](y) for all
iterations of the loop.

(4) [k] is strictly monotonic on D_k since incr is a nonzero constant
for the loop.

(5) For all iterations [h] and [h'] produce members of D_k. This is
clear because all members of H vary only in their computation for
value, leaving the computation of a, b, and incr unaffected.

(6) [acc] is strictly monotonic in area and value*incr.
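
The non-crossing claim used for condition (2) can be probed
numerically. The sketch below enumerates every pair of distinct
polynomials in P+(M) up to a small degree and checks that the sign of
their difference never changes at sample points beyond M + 1. The
values of M and DEGREE, the sample points, and all function names are
assumptions chosen for illustration; sampling supports the claim but
does not prove it.

```python
from fractions import Fraction
from itertools import combinations, product

def evaluate(coeffs, x):
    """Horner evaluation; coeffs[i] multiplies x**i."""
    total = 0
    for c in reversed(coeffs):
        total = total * x + c
    return total

def sign(v):
    return (v > 0) - (v < 0)

M, DEGREE = 2, 2
polys = list(product(range(M + 1), repeat=DEGREE + 1))   # P+(M) up to DEGREE
points = [M + 1 + Fraction(n, 4) for n in range(1, 21)]  # M+1.25, ..., M+6

# Count pairs whose difference is zero or changes sign beyond M + 1.
crossings = 0
for p, q in combinations(polys, 2):
    signs = {sign(evaluate(p, t) - evaluate(q, t)) for t in points}
    if len(signs) != 1 or 0 in signs:
        crossings += 1
```

For these parameters no pair crosses, so one polynomial of each pair
maximizes the other at every sampled point beyond M + 1.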

Error IA 6

A wrong h function may be implemented. Two instances are considered.

    Error                    Creation Condition
    Off by a constant > 0    true
    Off by a factor > 1      incr > 0

Propagation Condition -- true

For either error, a > b upon loop exit, as can be seen from the last
iteration of the loop. Thus, [incr < 0]@8 causes a domain error with
statement 12 executed in place of statement 11.

Error IA 5

A wrong k function may be implemented.

    Error                Creation Condition
    Off by a constant    true
    Off by a factor      value * incr <> 0

Propagation Condition -- true

Since neither of these two errors affect the monotonicity of k, the
argument used for IP 7 holds.

Error IC 4

In place of the <=, another comparison may have been substituted.

    Error    Creation Condition
    <        a + incr = b
    >        true
    >=       a + incr <> b
    =        a + incr < b
    <>       a + incr >= b

Propagation Condition

The substitution of < for <= is an excellent example of a mutant
being unobviously equivalent to the given program. This is discovered
while attempting to find the propagation condition for this "error."
The substitution may cause the loop to halt one iteration too soon,
with termination guaranteeing that incr is unchanged by line 8.
Thus, the execution of line 10 computes the same value for area as an
additional execution of line 5. An additional execution of the loop
would result in [incr = 0]@8, so [value * incr = 0]@9. Hence, statement
10 would not change the value of area computed by the additional
execution of the loop. Thus, the substitution of < for <= is an
equivalent mutant and the propagation condition is false.
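
The equivalence can be observed by running both versions side by side.
In the sketch below, the comparison on line 4 is a parameter and exact
rational arithmetic is used so that b - a equals incr exactly when the
loop halts one iteration early. The single polynomial p, the parameter
name loop_test, and the test grid are assumptions made for this
illustration.

```python
import operator
from fractions import Fraction

def calcarea(a, b, incr, p, loop_test=operator.le):
    """Port of calcarea with the comparison on line 4 made pluggable."""
    value = p(a)
    area = Fraction(0)
    while loop_test(a + incr, b):      # line 4: '<=' in the original
        area += value * incr
        a += incr
        value = p(a)
    incr = b - a                       # line 8
    if incr >= 0:                      # line 9
        return area + value * incr     # lines 10-11
    return None                        # line 12

p = lambda x: 3 * x * x + 2 * x + 1
cases = [(Fraction(a), Fraction(b), Fraction(1, d))
         for a in range(0, 3) for b in range(0, 6) for d in (1, 2, 3)]

# Inputs on which the '<' mutant and the original '<=' disagree.
disagreements = [c for c in cases
                 if calcarea(*c, p) != calcarea(*c, p, loop_test=operator.lt)]
```

No input in the grid separates the two versions, including the inputs
where a + incr reaches b exactly.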

For the other four substitutions to influence an output, a
created error must propagate to statement 11 or 12. A propagation
condition of a + 2*incr < b is sufficient, since this guarantees the loop
will execute at least twice, causing the loop computed value for area
to be strictly greater than the tail approximation in lines 8-10.
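
The creation conditions tabulated for IC 4 can also be checked by
enumeration. In the sketch below, s stands for a + incr; each
condition is compared with the set of points where the substituted
comparison takes a different branch from the original <=. The names
CREATION and differs, and the integer grid, are assumptions of this
illustration.

```python
import operator
from itertools import product

# Mutant comparison -> creation condition on s = a + incr and b (line 4).
CREATION = {
    operator.lt: lambda s, b: s == b,
    operator.gt: lambda s, b: True,
    operator.ge: lambda s, b: s != b,
    operator.eq: lambda s, b: s < b,
    operator.ne: lambda s, b: s >= b,
}

def differs(mutant, s, b):
    """True if the mutant takes the other branch from the original '<='."""
    return mutant(s, b) != (s <= b)

# Points where a tabulated condition and the actual behavior disagree.
mismatches = [
    (mutant.__name__, s, b)
    for mutant, condition in CREATION.items()
    for s, b in product(range(-3, 4), repeat=2)
    if condition(s, b) != differs(mutant, s, b)
]
```

On this grid each creation condition holds exactly where the mutant
comparison and <= evaluate differently.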

Error II 3

Suppose area should have been initialized to another constant.

Creation Condition -- true

Propagation Condition -- a <= b

With this condition satisfied, area@3 influences area@10. Therefore,
if an error has occurred at line 3, an incorrect value for area will
be printed by line 11.

Errors IP 2, II 2

Creation Condition -- a > M + 1

Propagation Condition -- a < b < a + incr

This condition forces the program to follow the path

    p = (1, 2, 3, 4f, 8, 9t, 10, 11),

skipping the loop body. [a < b]@1 ensures that [incr > 0]@9, so
value@10 influences area@10. Since value@10 = value@2 for path p,
value@2 influences area@10.

1-  ±.  Ag.a.Lulag  SiLt  atXJUm 

It should be noted that errors have been eliminated in a
"bottom-up" fashion. Recall that the justification for the strategy
assumed that a program could be separated into two segments R and Q,
with Q being correct. Certainly if R is the whole program, Q is
trivially correct. As errors are eliminated from the end of R, then
Q can expand to contain this "correct" code. It is possible, however,
that R contains two structural errors that mask one another, with the
first preventing discovery of the second on certain paths, and vice
versa. For example,

1  x := 3 * y;
2  z := x - 4;
3  write (z);

Suppose the error category of interest is "incorrect constants."
Clearly, both the creation and propagation condition are true for this
error in lines 1 and 2; any state will produce an incorrect state and
the error is guaranteed to propagate to line 3. Yet, for y@1 = 4, the
following program is equivalent and contains the incorrect constants:

1  x := 2 * y;
2  z := x;
3  write (z);
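
The masking can be seen by running both fragments over a range of
inputs. The function names intended and doubly_wrong in this sketch
are hypothetical; the bodies transcribe the two three-line programs
above.

```python
def intended(y):
    x = 3 * y        # line 1
    z = x - 4        # line 2
    return z         # line 3 writes z

def doubly_wrong(y):
    x = 2 * y        # incorrect constant on line 1
    z = x            # constant 4 dropped on line 2
    return z         # line 3 writes z

# Inputs on which the two coupled errors cancel.
agreeing_inputs = [y for y in range(-10, 11) if intended(y) == doubly_wrong(y)]
```

Only y = 4 hides the two errors in this range, which illustrates why a
second input on the same path is enough to expose them.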

Testing additional paths in which statements 1 and 2 are not coupled,
and testing a coupled path with more inputs are two ways of
reducing the impact of such errors. The situation is similar to that
of linear domain testing in which assignment and equality blindness
prevent certain predicate errors from being eliminated. Here, however,
the blindness is due to a presumed error in the first part of the
program, rather than in the correct operation of the first part of the
program. Evidence exists that such coupling rarely occurs in practice
[Bud80], but investigation of the phenomenon may yield greater insight
into error propagation. Such investigation is currently under way.

testing and verification in a new way. Some of the advantages are as
follows:


(1) In structural and functional testing incorrect code may remain
undetected even though executed. Consequently, the certainty of
the results is difficult to determine. The strategy given here
can guarantee that certain common errors are not present in the
program. Structural and functional testing, on the other hand,
only guarantee the elimination of very few error categories.
Test data that guarantees the elimination of certain errors in
addition to satisfying the usual functional and structural criteria
is necessarily of better quality than test data that issues
no guarantee. Thus, the proposed strategy provides a means of
increasing test data quality.

(2) When quality is lacking, the tester can be directed to specific
lines of code where potential errors have not yet been eliminated.
This guidance is more specific than possible with structural
testing.

(3) The proposed strategy is more efficient than mutation testing for
killing particular mutants. First, the actual mutants do not have
to be generated or executed. Second, one test point can eliminate
all mutants along an execution path. Third, mutation testing
provides little guidance when a mutant executes correctly.
The proposed strategy guides the data selection process towards
selecting data that creates the state in which an error could
occur and in which it will propagate.

(4) The proposed strategy is based upon the function testing suggested
by Foster and Howden, from which the concept of a creation
condition may be inferred. The inclusion of a propagation condition
in the strategy provides greater assurance that created
errors will not be canceled by the remaining execution of the
program.

(5) The proposed strategy extends linear domain testing by removing
two restrictions. First, it need not be assumed that the
predicates to be tested are a linear combination of the input
variables. Second, it need not be assumed that a domain error
necessarily produces a functional error. Indeed, this must be
proven in the proposed strategy.

(6) The proposed strategy combines testing and formal verification in
a new way. The goal is to force the program to inform the tester
of its own errors through testing. Formal verification is used
to support this process. As a support tool, formal verification
is used in a restricted capacity, lessening the difficulty normally
encountered in formal proofs of correctness.

(7) Weyuker [Wey81] has argued that a testing strategy should use all
the information that can be obtained from the program, the
program's specification, and the errors commonly encountered.
This strategy suggests that the computation of the program itself
is another important source of information. The wealth of information
that is contained in the computation has been virtually
untapped by structural testing. The computation of a program on
one input effectively eliminates a huge number of possible
errors. Greater knowledge of these eliminated errors would
increase our confidence in a program's correctness.

One weakness of the proposed strategy is the assumption shared
with mutation testing that errors can be eliminated one at a time;
i.e. two errors do not interact in such a way that each error prevents
the other error from being eliminated by the strategy. There is
evidence that this "coupling effect" rarely occurs in practice [Bud80],
yet the strategy does not currently handle such situations. The
concepts of creation conditions, propagation conditions, and influence do
provide a framework in which such errors can be discussed and
analyzed. Such work is currently in progress.

Point (7) suggests the direction of research needed.

(1) Program errors need to be categorized and creation conditions
developed for each category.

(2) Loops other than accumulation loops need to be analyzed as to
their error propagation characteristics.

(3) Automatic methods for developing propagation conditions need to
be developed.

(4) Methods need to be developed for correlating the information
obtained from different computations. A set of computations may
collectively eliminate an error category for which no individual
computation in the set can. For example, consider the potential
error in which an incorrect variable occurs in an output statement.
The creation condition of "all variables different from X"
may not be satisfied on any one computation, but over a set of
computations all variables may indeed be differentiated from X.
This potential error can therefore be eliminated by the collective
evidence from the set of computations.
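
The collective argument in point (4) can be sketched as follows. Each
computation is represented by the state of the variables at the output
statement; the representation as dictionaries, the function names, and
the sample values in runs are all assumptions introduced for this
illustration.

```python
def differentiated(computations, x):
    """Variables that differ from x in at least one observed state."""
    seen = set()
    for state in computations:
        seen |= {v for v in state if v != x and state[v] != state[x]}
    return seen

def eliminated_collectively(computations, x):
    """True if every other variable is told apart from x in some state."""
    others = set().union(*computations) - {x}
    return differentiated(computations, x) == others

# Hypothetical states observed at the output statement on two test runs.
runs = [
    {"area": 6.0, "value": 6.0, "incr": 1.0},   # area equals value here
    {"area": 2.0, "value": 4.0, "incr": 2.0},   # area equals incr here
]
```

Neither run alone satisfies "all variables different from area," but
the two runs together differentiate every variable from area.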